To better understand the genetic diversity and genomic features of 41 coronaviruses (CoVs) identified 
from Kenya bats in 2006, seven CoVs, representing seven different phylogenetic groups identified 
from partial polymerase gene sequences, were subjected to extensive genomic sequencing. As a result, 
15-16 kb nucleotide sequences encoding complete RNA dependent RNA polymerase, spike, envelope, 
membrane, and nucleocapsid proteins plus other open reading frames (ORFs) were generated. Sequence 
analysis confirmed that the CoVs from Kenya bats are divergent members of the Alphacoronavirus and 
Betacoronavirus genera. Furthermore, the CoVs BtKY22, BtKY41, and BtKY43 in Alphacoronavirus genus and 
BtKY24 in Betacoronavirus genus are likely representatives of 4 novel CoV species. BtKY27 and BtKY33 
are members of the established bat CoV species in Alphacoronavirus genus and BtKY06 is a member of the 
established bat CoV species in Betacoronavirus genus. The genome organization of these seven CoVs is 
similar to other known CoVs from the same groups except for differences in the number of putative ORFs 
following the N gene. The present results confirm a significant diversity of CoVs circulating in Kenya bats. 
These Kenya bat CoVs are phylogenetically distant from any previously described human and animal 
CoVs. However, because host switching among CoVs has occurred after relatively minor sequence 
changes in the S1 domain of the spike protein, further surveillance in animal reservoirs and a better 
understanding of host susceptibility are critical for predicting and preventing the potential threat of bat 
CoVs to public health. 

Published by Elsevier B.V. 


1. Introduction 

Coronaviruses (CoVs) are large, enveloped viruses containing 
linear, positive-sense, single-stranded RNA genomes. Their 
genomes range from approximately 27 to 32 kb in length and contain 
7-14 open reading frames (ORFs) (Woo et al., 2009a). Six major 
ORFs encoding the polymerase complex (ORF1a and ORF1b), spike glycoprotein 
(S), envelope protein (E), membrane glycoprotein (M), 
and nucleocapsid protein (N) are present in all CoVs (Poon et al., 
2005). In addition, up to seven putative accessory ORFs and one 
ORF encoding hemagglutinin-esterase glycoprotein (HE) are interspersed 
between the six major ORFs. The numbers and sizes of these 
accessory ORFs differ markedly among CoVs (Woo et al., 2009a). 

CoVs have been identified from a broad range of birds and 
mammals, including humans, in which they can cause respiratory, 
enteric, hepatic and neurologic diseases of varying severity (Weiss 
and Navas-Martin, 2005). CoVs in the subfamily Coronavirinae are 
classified into three genera, Alphacoronavirus, Betacoronavirus, and 
Gammacoronavirus (former serogroups 1-3) (de Groot et al., 2011). 
Alpha- and beta-coronaviruses have been exclusively isolated from 
mammals and the majority of gamma-coronaviruses from birds. CoVs 
of a distinctive lineage were recently detected from birds and pigs 
(Chu et al., 2011; Woo et al., 2009b, 2012) and have been proposed 
to belong to a new genus, provisionally named Deltacoronavirus (de 
Groot et al., 2011). The finding that the outbreak of severe acute respiratory 
syndrome (SARS) in early 2003 was caused by a novel CoV 
(SARS-CoV) has boosted interest in the search for novel CoVs in 
humans and animals. At least 30 previously unrecognized distinctive 
CoVs from human and various animal reservoirs were reported 
during recent years, including SARS-related CoVs and CoVs from 
all genera in the subfamily Coronavirinae, which have significantly 
expanded our understanding of CoV diversity and complexity (Woo 
et al., 2009a). Based on available data, bats appear to harbor a great 
diversity of CoVs. The frequency and diversity of CoV detection 
in bats, now in multiple continents, suggest that bats are likely a 
source for CoV introduction into other species globally and possibly 
play an important role in the ecology and evolution of CoVs. 

Recently we reported the identification of 41 divergent CoVs in 
bats from Kenya, based on limited ORF1b sequences (Tong et al., 
2009). These newly discovered bat CoVs were grouped into 8 different 
phylogenetic clusters. Of these, five clusters belonged to the 
previously identified Alphacoronavirus genus, and three clusters 
belonged to the previously identified Betacoronavirus genus, including 
a SARS-related CoV lineage. In the present study, we expand 
our sequence data for seven CoVs, representing 7 of the 8 distinctive 
clusters we identified in Kenya bats during the summer of 2006 
(Tong et al., 2009). The sample representing the eighth cluster, a 
SARS-related CoV, was a weak positive with a limited specimen 
amount; it was therefore not sequenced further and is not included 
in this analysis. The purpose of our study was to further characterize 
the genomes and refine the phylogenetic relationships of 
these seven CoVs with other CoVs, based on ORF1b, S, E, M, 
and N. 

2. Materials and methods 

2.1. Bat sampling and RNA extraction 

Kenya was chosen as a major comparative Old World study 
location in Africa as part of the CDC Global Disease Detection 
program. Detailed information on bat capture and sampling is available 
in the previous publication (Tong et al., 2009). The protocols 
for animal capture and use were approved by the CDC Animal 
Institutional Care and Use Committee and the Ethics and Animal 
Care and Use Committee of the Kenya Wildlife Service (Nairobi, 
Kenya). In brief, representative samples at each site were collected 
from bats of available species, including adults and juveniles of both 
sexes. After euthanasia, a complete necropsy was performed in 
compliance with the approved field protocols. Samples included 
blood, various organs (liver, lung, and kidney), and rectal and oral 
swabs. 

In this study, seven CoV-positive rectal swabs were selected 
as representatives of the seven different phylogenetic groups 
(Tong et al., 2009) for extensive genome sequencing. These 
are Rousettus bat coronavirus/Kenya/KY06/2006 (BtKY06), 
Chaerephon bat coronavirus/Kenya/KY22/2006 (BtKY22), 
Eidolon bat coronavirus/Kenya/KY24/2006 (BtKY24), 
Miniopterus bat coronavirus/Kenya/KY27/2006 (BtKY27), Miniopterus 
bat coronavirus/Kenya/KY33/2006 (BtKY33), Chaerephon bat 
coronavirus/Kenya/KY41/2006 (BtKY41), and Cardioderma 
bat coronavirus/Kenya/KY43/2006 (BtKY43). BtKY43 was not 
described previously, but represents a group of 4 Kenya bat CoVs 
(BtKY03, BtKY12, BtKY13, and BtKY29) (Tong et al., 2009). Total 
nucleic acids (TNA) were extracted using the QIAamp MinElute 
Virus Spin Kit (Qiagen, Santa Clarita, CA) according to the manufacturer’s 
instructions from 200 µL of a phosphate buffered saline 
suspension of the rectal swab and from homogenized organ tissues 
(liver, lung, and/or kidney) of each bat, except for bats BtKY33 
and BtKY43, whose organ tissues were not available. The TNA was 
eluted in 80 µL of DEPC-treated water and then stored at -80 °C. 

2.2. Reverse transcription-PCR (RT-PCR) 

Each CoV-positive rectal swab result included in this 
study was reproduced from a different TNA aliquot. The presence of 
CoV RNA in organ tissues of these bats was determined using 
the pan CoV RT-PCR assays as described previously (Tong et al., 
2009) and the sequence-specific and/or group-specific CoV RT-PCR 
assays (Table S1). The RT-PCR assays were performed as described previously 
(Tong et al., 2009). Standard precautions were taken to 
avoid cross-contamination of samples before and after RNA extraction 
and amplification. Purified DNA amplicons were sequenced 
with the RT-PCR primers on an ABI Prism 3130 automated capillary 
sequencer using a BigDye Terminator v3.1 Cycle Sequencing 
kit (Applied Biosystems, Carlsbad, CA). 

2.3. Partial genome sequencing 

High throughput 454 pyrosequencing on CoV RNA-positive 
bat samples was initially attempted, but failed to acquire any 
CoV-associated reads due to lower sensitivity. Therefore, RT-PCR 
amplicon sequencing by the Sanger chain-termination method 
was chosen in this study. Each of the seven contiguous sequences 
was obtained by using 4-6 pairs of semi-nested or nested consensus 
degenerate group-specific primers and 4-7 pairs of semi-nested 
or nested sequence-specific bridging primers, which generated a 
series of 8-13 overlapping fragments covering 15-16 kb of genomic 
sequence at the 3' end (Table S1). The other half of the genome sequence, 
containing ORF1a, was not recovered in this analysis due to 
the limited amount of rectal swab samples. Consensus degenerate 
primers of each group were designed from conserved sequences 
of known members of the corresponding sequence group or its 
close group based on the CODEHOP strategy (Rose et al., 1998). The 
3' end of the genome sequence was determined using the 3' RACE kit 
(Roche, Indianapolis, IN) according to the manufacturer’s instructions. 
Semi-nested or nested primers were used to improve the 
PCR sensitivity. When nested primers were not available, the PCR 
product was re-amplified using the same RT-PCR primers. The 
RT-PCR reactions were performed with the Superscript III one-step 
RT-PCR High Fidelity kit (Invitrogen, San Diego, CA) according to 
the manufacturer’s instructions, and the second round PCR reactions 
were performed with the AccuPrime Taq DNA polymerase High 
Fidelity kit (Invitrogen, San Diego, CA). The RT-PCR products were 
visualized on 1% agarose gels containing 0.5 µg/mL of ethidium 
bromide, and purified with the QIAquick PCR purification kit (QIAGEN, 
Santa Clarita, CA). The RT-PCR amplicons for each sample were 
first sequenced with the consensus degenerate RT-PCR primers in 
both directions, and then the remaining internal gaps and the 3' end 
of the genome were sequenced with sequence-specific bridging primers 
in both directions as described previously. The genomic sequences 
(ORF1b, S, ORF3, E, M, and N) of BtKY22, BtKY33, BtKY27, BtKY41, 
BtKY43, BtKY06, and BtKY24 were deposited in NCBI GenBank 
(HQ728480-HQ728486). 

2.4. Sequence analysis 

Sequences were assembled in Sequencher (Genecodes, Ann 
Arbor, MI). Each putative ORF was predicted using the NCBI 
ORF finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). 
N-glycosylation sites were predicted using the NetNGlyc 1.0 Server 
(http://www.cbs.dtu.dk/services/NetNGlyc/). BLAST analyses 
were performed against the NCBI non-redundant protein database 
(Altschul et al., 1990) and against the Conserved Domain Database 
for protein classification (CDD) (Marchler-Bauer et al., 2005) to 
characterize the putative ORFs. 
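
The ORF prediction step can be illustrated with a minimal sketch. It is not the NCBI ORF finder algorithm; it only scans the three forward reading frames (CoV genomes are positive-sense) for ATG-to-stop stretches. The function name find_orfs and the min_len_nt cutoff are illustrative assumptions, not parameters from this study.

```python
# Minimal forward-strand ORF scan. Assumptions (not from the study):
# ATG start codons only, standard stop codons, hypothetical length cutoff.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len_nt=150):
    """Return sorted (start, end, frame) tuples.

    start is 0-based, end is exclusive and includes the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len_nt:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
    print(find_orfs(toy, min_len_nt=9))
```

Candidate ORFs found this way would still need the BLAST and CDD comparisons described above before being treated as putative genes.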

Alignments of the seven Kenya bat CoV gene sequences with 
a representative set of 43 other CoV sequences, available in the 
public domain, were performed using MUSCLE v3.6 (Edgar, 
2004). We constructed maximum likelihood trees for each gene 
alignment (ORF1b, S, E, M, and N) in the MEGA software package 
v5.0 (Tamura et al., 2011) with 1000 bootstrap replications. We 
used the General-Time-Reversible nucleotide (nt) substitution model 
with 4 categories of gamma distributed rate heterogeneity and 
a proportion of invariant sites (GTR + Γ4 + I). To identify potential 
recombination events of the seven Kenya bat CoVs, three methods 
implemented in the recombination detection program RDP version 2 
(Martin et al., 2005) were used, including MaxChi (Smith, 1992), 
Chimaera (Posada et al., 2002), and Geneconv (Padidam et al., 1999). 

Fig. 1. Schematic representation of the genome organization of Kenya bat CoVs and representative alpha- and beta-coronaviruses. Shaded boxes represent open reading 
frames (ORFs) encoding structural proteins and unshaded boxes represent those encoding nonstructural proteins. 


Table 1 

Genomic features of open reading frames from seven bat coronaviruses and their putative transcription regulatory sequences (TRS). 

Genus                             Alphacoronavirus                                  Betacoronavirus
Virus                             BtKY27    BtKY33    BtKY22    BtKY41    BtKY43    BtKY24    BtKY06
Sequences a (nt)                  15314     15908     15480     15578     15474     16186     16201
ORF1a (nt)                        NA b      NA b      NA b      NA b      NA b      NA b      NA b
ORF1b (nt)                        8022      8025      8025      8025      8022      8040      8067
S     ORF size (nt)               4128      4152      4071      4161      4095      3795      3837
      Putative TRS                CUAAAU    CUAAAU    CUAAAU    CGAAAU    CUAAAU    ACGAAC    ACGAAC
ORF3  ORF size (nt)               660       672       672       687       660       717       663
      Putative TRS                CGUUAC    CGUUAC    CGUUAC    CUAGAC    CUAAAC    ACGAAC    ACGAAC
E     ORF size (nt)               225       225       225       231       243       228       249
      Putative TRS                CUAUAC    CUUUAC    CUCUAC    CUAGAC    CUUUAC    UCGAAC    UCGAAC
M     ORF size (nt)               768       780       684       690       684       666       669
      Putative TRS                CUAAAC    CUAAAC    CUAAAC    CUAAAC    CUAAAC    ACGAAC    ACGAAC
N     ORF size (nt)               1185      1296      1263      1227      1182      1404      1407
      Putative TRS                CUAAAC    CUAAAC    CUAAAC    CUAAAU    CUAAAC    ACGAAC    ACGAAC
ORFx  ORF size (nt)               -         486       231       264       288       567       558
      Putative TRS                -         CAAAAU    CUAAAC    CUAAAU    CUAAAC    ACGAAC    ACGAAC
ORFy  ORF size (nt)               -         -         -         195       -         432       450
      Putative TRS                -         -         -         CUAAAC    -         ACGAAC    ACGAAC
3' UTR (nt, excluding poly A)     269       222       251       222       221       231       217

a Partial genome sequence starts from the first nt position in the RdRp to the end of genome. 
b NA, not available. 

Fig. 2. Multiple amino acid sequence alignments showing the putative S1-S2 junctional region of CoV spike protein. The identical amino acids are highlighted in black and 
the similar amino acids are highlighted in gray. The regions containing the S1 GxCx motif, conserved S2 nonamer IPTNFSISI, the furin cleavage site (in MHV, HCoV OC43, and 
BCoV; underlined), and cathepsin L cleavage site (in SARS-CoV) are indicated. 


Events detected by all three methods with default parameters were 
considered as potential recombination events. 

3. Results and discussion 

3.1. Detection of CoV RNA in bat tissues 

The aliquots of bat rectal samples for BtKY27, BtKY33, BtKY22, 
BtKY41, BtKY43, BtKY24, and BtKY06 were confirmed positive by 
the pan CoV RT-PCR assay, while among tissues (liver, lung, and/or 
kidney) that were available from bats BtKY27, BtKY22, BtKY41, 
BtKY24, and BtKY06, only the liver from bat BtKY22 (Chaerephon 
sp.) and the kidney from bat BtKY24 (Eidolon helvum) tested positive 
by RT-PCR. These data support an infection process rather than 
transit of ingested infected material through the digestive tract as 
the source of viral RNA in rectal swabs, particularly because these 
bat species do not feed on vertebrates. Negative results for other 
tissues may be explained by specific pathobiology and a limited 
tropism to the available tissues. 

3.2. Partial genome sequence and organization 

Each acquired CoV genome sequence covers the complete 
ORF1b, S protein, ORF3, E protein, M protein, N protein, other putative 
ORFs after N and the 3' end untranslated region with a poly 
A tail. The genome organization and size for each of the ORFs are 
shown in Fig. 1 and Table 1, respectively. They are similar to other 
known CoV genome organizations in the order of 5'-ORF1b, S, ORF3, 
E, M, and N-3', but have a variable number of putative ORFs downstream 
of the N gene. The sizes of these seven genomic sequences 
from ORF1b to the 3' end are between ~15 kb and ~16 kb and their G + C 
contents are between 37.6% and 42.6%. BtKY27 has no evidence of 
a putative ORF downstream of the N gene, but possesses a short 
untranslated region and poly-A tail similar to Bat-CoV 1A (Chu et al., 
2008). BtKY22, BtKY33 and BtKY43 have one small putative ORF 
(76-161 amino acids (aa)) downstream of the N with no significant 
homology to previously described CoV ORFs. BtKY06 and BtKY24 
have two small putative ORFs downstream of the N with sequence 
similarity to NS7a and NS7b in Bat-CoV HKU9, respectively (Woo 
et al., 2007). BtKY41 has two small putative ORFs downstream 
of the N, which overlap and have no significant sequence 
homology to the previously described ORFs. 
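
For reference, the G + C content quoted above is the percentage of G and C among the unambiguous bases of a sequence. The following sketch shows one way to compute it; the function name gc_content and the choice to skip ambiguity codes such as N are assumptions made for illustration.

```python
def gc_content(seq):
    """Percent G + C among unambiguous bases (A, C, G, T/U)."""
    # Assumption: ambiguity codes (e.g. N) and gaps are ignored.
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    print(round(gc_content("ATGCNN"), 1))
```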

Like most alphacoronaviruses, the BtKY27, BtKY33, BtKY22, 
BtKY41, and BtKY43 viruses share a core sequence 5'-CUAAAC-3' or 
similar putative transcription regulatory sequence (TRS) upstream 
of ORFs S, M, N, and ORFx and ORFy (Table 1) (Chu et al., 2008; Woo 
et al., 2005). ORF3 and E have putative core TRSs that sometimes 
varied from that for the other ORFs. BtKY06 and BtKY24 have 
a core sequence TRS 5'-ACGAAC-3' upstream of each ORF 
except E, which has a core sequence TRS 5'-UCGAAC-3' (Table 1). 
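
A search for a core TRS "or similar" sequence upstream of an ORF can be sketched as a mismatch-tolerant motif scan. This is only an illustration of the idea: the function name find_trs_upstream, the 100-nt search window and the one-mismatch tolerance are assumptions, not the criteria used to compile Table 1.

```python
def find_trs_upstream(genome, orf_start, core="CUAAAC",
                      window=100, max_mismatches=1):
    """List (position, motif, mismatches) for TRS-like hexamers that lie
    entirely within `window` nt upstream of the 0-based `orf_start`.

    window and max_mismatches are hypothetical illustration values.
    """
    rna = genome.upper().replace("T", "U")
    core = core.upper().replace("T", "U")
    first = max(0, orf_start - window)
    hits = []
    for i in range(first, orf_start - len(core) + 1):
        candidate = rna[i:i + len(core)]
        mismatches = sum(a != b for a, b in zip(candidate, core))
        if mismatches <= max_mismatches:
            hits.append((i, candidate, mismatches))
    return hits


if __name__ == "__main__":
    toy = "GGGG" + "CTAAAC" + "GGGGG" + "ATG" + "GGG"
    print(find_trs_upstream(toy, orf_start=15))
```

Allowing one mismatch is what lets variants such as CUAAAU be reported alongside the canonical core; for the betacoronaviruses the core argument would be set to ACGAAC.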

Spike proteins are type I glycosylated membrane proteins 
with a putative signal peptide at the N terminus. There are 31, 27, 


Table 2 

Pairwise sequence comparison of Kenya bat CoVs with their nearest known CoV species. 

                                                       % identity to nearest known CoV a
Genus             Kenya bat CoV  Nearest known CoV     3' genome b  Nsp12 c  Nsp13 c  Nsp14 c  Nsp15 c  Nsp16 c  S c  E c  M c  N c
Alphacoronavirus  BtKY27         Bat-CoV 1A            85           97       96       94       95       95       87   91   93   91
                  BtKY33         Bat-CoV 1A            75           93       91       90       87       93       62   65   75   69
                  BtKY22         Bat-CoV HKU8          71           86       88       80       75       87       56   70   79   58
                  BtKY41         Bat-CoV/512/05        69           80       86       79       75       86       55   63   70   57
                  BtKY43         Bat-CoV HKU8          69           84       88       77       71       80       53   49   73   52
Betacoronavirus   BtKY06         Bat-CoV HKU9          90           >99      99       99       99       86       83   97   95   94
                  BtKY24         Bat-CoV HKU9          70           87       88       82       69       79       52   57   66   66

a The nearest known CoV species were chosen based on the BLAST search. 
b 3’ 15-16 kb genome nucleotide identity. 
c Amino acid identity. 

Fig. 3. Phylogenetic analysis of ORF1b, S, M and N of bat CoVs from Kenya. The unrooted trees are constructed by the maximum likelihood method with 1000 bootstrap replications after ambiguous regions from alignments of ORF1b, 
S, M, and N are removed. The seven Kenya CoVs are highlighted with solid circles. The genus taxonomy information is shown to the right side of the phylogeny. The maximum likelihood bootstrap is indicated next to the nodes. 
The scale bar indicates the estimated number of nucleotide substitutions per site. 

28, 25, 31, 20, and 19 potential N-glycosylation sites in BtKY22, 
BtKY27, BtKY33, BtKY41, BtKY43, BtKY24, and BtKY06, respectively. 
As shown in Fig. 2, spike proteins of the seven bat CoVs lack a furin 
protease recognition site, such as RRADR-S in Murine Hepatitis 
Virus (MHV), RRSRG-A in human CoV OC43 (HCoV OC43), RRSRR-A 
in bovine CoV (BCoV) (Follis et al., 2006), and a cathepsin L cleavage 
site (VAYT-M) as in SARS-CoV (Bosch et al., 2008). In spite of lacking 
conserved cleavage sites, they all consist of two domains, S1 and S2, 
showing the conserved GxCx motif in S1 around the cleavage site 
and the conserved nonamer motif IPTNFSISI or a similar motif in S2. 
These motifs have been observed in other known CoVs (Follis et al., 
2006). The S1 domain is responsible for virus binding to the receptor on the 
target cells and may contain receptor binding domains (RBDs) that 
directly bind to host cellular receptors. For example, the RBDs of 
HCoV 229E, TGEV, and HCoV NL63 in Alphacoronavirus are mapped 
at the C terminus of their S1 domain (Bonavia et al., 2003; Godet 
et al., 1994; Lin et al., 2008). The RBDs of MHV and SARS-CoV in 
Betacoronavirus are mapped at the N terminus and central region of the S1 
domain, respectively (Li et al., 2005; Lin et al., 2008). Alignment of 
aa sequences of S1 regions from BtKY22, BtKY27, BtKY33, BtKY41, 
and BtKY43 of Alphacoronavirus with the corresponding known RBD 
S1 regions of HCoV 229E, TGEV, and HCoV NL63 showed 33-41% 
identity in S1 RBD domains to HCoV 229E and 24-29% identity to 
TGEV and HCoV NL63 (Fig. S1A-C). BtKY24 and BtKY06 from Betacoronavirus 
are quite different in the corresponding RBD S1 regions 
from SARS-CoV and MHV (17-19% identity) (Fig. S1D-E). The dissimilarity 
of S1 regions of these bat CoVs to other CoVs may suggest 
their different host specificity. 
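
The potential N-glycosylation sites counted in the spike proteins rest on the N-X-S/T sequon, where X is any residue except proline. The sketch below lists sequon positions only; NetNGlyc additionally scores each sequon with a trained predictor, so this simple motif scan is a stand-in and may report more sites than the server would. The name sequon_positions is an illustrative assumption.

```python
import re

# N-X-[S/T] with X != P; the lookahead lets adjacent sequons overlap.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MNGSANPTNNTS"
    print(sequon_positions(toy))
```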

3.3. Phylogeny 

We constructed phylogenetic trees using the maximum likelihood 
method based on nt sequences of the ORF1b, S, E, M and N genes 
with representative viruses whose corresponding genome 
sequences were available (Fig. 3). The phylogeny of the E gene is 
not shown due to its short length and limited value for inferring 
species phylogenies. Similar topologies were observed in the phylogenetic 
trees based on each of the 5 ORFs (Fig. 3). The analysis revealed 
that among the seven bat CoVs, five belonged to Alphacoronavirus 
while the other two belonged to Betacoronavirus (Fig. 3). Phylogenetic 
clusterings within Alphacoronavirus varied slightly when 
different genes were analyzed. For example, BtKY22 and BtKY43 
grouped into one monophyletic clade in the ORF1b tree while they 
were grouped differently in the S and N gene trees with generally 
insignificant bootstrap values (Fig. 3). Although recombination 
was suspected, we found no evidence of recombination in the seven 
analyzed viruses using MaxChi (Smith, 1992), Chimaera (Posada 
et al., 2002), and Geneconv (Padidam et al., 1999). Since the analyses 
were based on representatives from each CoV species, the results 
suggest a lack of inter-species recombination in these viruses. One 
explanation is that the recombination frequency decreases significantly 
when the sequence divergence is high (Kleiboeker et al., 
2005; van Vugt et al., 2001). Alternatively, the lack of inter-species 
recombination may be due to rare co-infections as the viruses adapted 
to different bat species. Therefore, the phylogenetic incongruence 
observed in the gene trees is probably due to low phylogenetic 
signals, which may be improved by sampling more CoVs that are 
related to BtKY22 and BtKY43. 

The pairwise nt comparisons among these seven bat CoV gene 
sequences revealed 67-76% overall nt identity. Among the five 
alphacoronaviruses, three (BtKY22, BtKY41 and BtKY43) were 
distantly related to other known alphacoronaviruses with only 
69-71% overall nt identity and with <90% aa identity in all five 
conserved domains (nsps 12-16) of ORF1b (Table 2). Since we 
were not able to obtain all the genome portions necessary for definite 
species classification (de Groot et al., 2011), we adopted the 
separation criteria based on the RdRp group units (RGU) (Drexler 
et al., 2010). The aa distances in the 816 bp fragment of the 
RdRp gene between the Kenya bat CoVs described in this study and 
their close reference viruses were determined 
(Table S2). 

BtKY22, BtKY41, and BtKY43 had >4.8% aa distance in the RdRp 
fragment (Table S2). This suggests that they are most likely three 
distinctive alphacoronavirus species. BtKY27 and BtKY33, identified 
in Miniopterus bats, were closely related to Bat-CoV 1A, which was 
identified from bent-winged Miniopterus bats in Hong Kong (Chu 
et al., 2006), with 85% and 75% overall nt identity and with >90% aa 
identity in 5/5 and 4/5 conserved domains (nsps 12-16) in ORF1b, 
respectively (Table 2). BtKY27 and BtKY33 had <4.8% aa distance 
in the 816 bp RdRp fragment to their close reference viruses, indicating that 
they are members of the established bat CoV species in Alphacoronavirus. 

As for the two members of Betacoronavirus genus identified, one 
(BtKY06 identified in Rousettus aegyptiacus bat) was likely a member 
of Bat-CoV HKU9 species identified from Rousettus leschenaulti 
bat in China (Woo et al., 2007), sharing 90% overall nt identity 
and 99% aa identity in 4/5 conserved domains (nsps 12-16) in 
ORF1b (Table 2). The other (BtKY24) was distantly related to other 
known betacoronaviruses with <70% overall nt identity and <90% 
aa identity in all 5 conserved domains (nsps 12-16) from ORF1b 
(Table 2). Additionally, based on the RGU criteria, BtKY24 had >6.3% 
aa distance in the 816 bp RdRp fragment compared to its closest 
reference virus, indicating that it is most likely a distinctive betacoronavirus. 
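
The RGU comparison above reduces to a pairwise aa distance and a genus-specific cutoff (4.8% for Alphacoronavirus, 6.3% for Betacoronavirus, as used in the text). A minimal sketch follows, assuming the two translated RdRp fragments are already aligned and that columns containing a gap are skipped; the function names and the gap handling are illustrative assumptions rather than the procedure of Drexler et al. (2010).

```python
# Genus-specific cutoffs as quoted in the text (percent aa distance).
RGU_THRESHOLD_PCT = {"Alphacoronavirus": 4.8, "Betacoronavirus": 6.3}


def aa_distance_pct(aln_a, aln_b):
    """Percent of differing residues between two aligned aa sequences.

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aln_a.upper(), aln_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable alignment columns")
    diffs = sum(a != b for a, b in pairs)
    return 100.0 * diffs / len(pairs)


def exceeds_rgu_threshold(distance_pct, genus):
    """True if the distance is above the genus cutoff (separate RGU)."""
    return distance_pct > RGU_THRESHOLD_PCT[genus]


if __name__ == "__main__":
    d = aa_distance_pct("ACDEFGHIKL", "ACDEFGHIKV")
    print(d, exceeds_rgu_threshold(d, "Alphacoronavirus"))
```

A distance above the cutoff is read here only as support for a distinct RGU; as noted above, definite species classification requires additional genome regions.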

In conclusion, sequence data for the structural and nonstructural 
ORFs in the 3'-end of the genome of seven Kenya bat 
CoVs confirmed their high diversity and their phylogenetic placement 
into the Alphacoronavirus and Betacoronavirus genera. The four 
clusters of Kenya bat CoVs represented by BtKY22, BtKY41, BtKY43, 
and BtKY24, respectively, most likely belonged to novel CoV species, 
the two clusters represented by BtKY27 and BtKY33 were likely 
members of Bat-CoV 1A, and the cluster represented by BtKY06 
was likely a member of Bat-CoV HKU9 species. As noted with other 
novel CoVs, the genome organization is similar but differences were 
found in the number of putative ORFs downstream from the ORF 
N. The present results are in line with previous findings of extensive 
diversity of CoVs detected in bats and confirm that bat CoVs 
mainly belong to the Alphacoronavirus and Betacoronavirus genera 
(Lau et al., 2005, 2007; Tang et al., 2006; Woo et al., 2007, 2009b). 
Consistent with other reports, none of the bat CoVs characterized in 
the present study was sufficiently similar to the human SARS-CoV 
or other human CoVs to be considered their direct progenitor. 
The examples of host switching among CoVs after relatively minor 
sequence changes in the S1 domain of the spike protein (Haijema et al., 
2003; Kuo et al., 2000; Qu et al., 2005) suggest a potential risk 
of introduction into humans, as occurred with SARS-CoV. Therefore, 
characterization of novel CoVs and understanding species diversity 
in animals should help understand and respond to emerging 
zoonotic infections. 